Comparing Computational Models of Selectional Preferences - Second-order Co-Occurrence vs. Latent Semantic Clusters

Author

  • Sabine Schulte im Walde
Abstract

This paper presents a comparison of three computational approaches to selectional preferences: (i) an intuitive distributional approach that uses second-order co-occurrence of predicates and complement properties; (ii) an EM-based clustering approach that models the strengths of predicate–noun relationships by latent semantic clusters; and (iii) an extension of the latent semantic clusters that incorporates the MDL principle into the EM training, thus explicitly modelling the predicate–noun selectional preferences by WordNet classes. We describe various experiments on German data and two evaluations, and demonstrate that the simple distributional model outperforms the more complex cluster-based models in most cases, but does not itself always beat the powerful frequency baseline.
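As a rough illustration of approach (i), the sketch below describes each predicate by the properties of the nouns it takes as complements. It is not the paper's implementation: the abstract does not say what counts as a "complement property" or how counts are weighted, so this sketch assumes properties are adjectives modifying the nouns and weights each property by the product of the two co-occurrence frequencies. The function name `second_order_profiles` and all counts are hypothetical toy values.

```python
from collections import Counter, defaultdict


def second_order_profiles(pred_noun_counts, noun_prop_counts):
    """Build a property profile for each predicate.

    pred_noun_counts: {(predicate, noun): frequency}  (first-order counts)
    noun_prop_counts: {noun: {property: frequency}}   (noun-property counts)

    Assumption (not from the paper): a property's weight for a predicate is
    the sum over complement nouns of f(predicate, noun) * f(noun, property).
    """
    profiles = defaultdict(Counter)
    for (pred, noun), f_pn in pred_noun_counts.items():
        for prop, f_np in noun_prop_counts.get(noun, {}).items():
            profiles[pred][prop] += f_pn * f_np
    return profiles


def normalise(profile):
    """Turn a weighted profile into a probability distribution."""
    total = sum(profile.values())
    return {prop: weight / total for prop, weight in profile.items()}


# Hypothetical toy counts (invented for illustration only).
pred_noun_counts = {("essen", "Apfel"): 3, ("essen", "Brot"): 2}
noun_prop_counts = {
    "Apfel": {"rot": 2, "frisch": 1},
    "Brot": {"frisch": 4},
}

profiles = second_order_profiles(pred_noun_counts, noun_prop_counts)
essen_distribution = normalise(profiles["essen"])
```

Under these assumptions the predicate is characterised by what its complements are typically like rather than by the complement nouns themselves, which is the sense of "second-order" used here; a different weighting (for example, association scores instead of raw frequencies) would slot into the same structure.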

Similar resources

Dgfs-cl Comparing Computational Models of Selectional Preferences – Second-order Co-occurrence vs. Latent Semantic Clusters

Selectional preferences (i.e., semantic restrictions on the realisation of predicate complements) are of great interest to research in Computational Linguistics, both from a lexicographic and from an applied (wrt data sparseness) perspective. This poster presents a comparison of three computational approaches to selectional preferences: (i) an intuitive distributional approach that uses second-...

Computational Models for Chinese Selectional Preferences Induction

Selectional preference (SP) is an important kind of semantic knowledge. It can be used in various natural language processing tasks, including metaphor computing, lexicon building, syntactic structure disambiguation, word sense disambiguation, semantic role labeling, anaphora resolution, etc. This paper presents and compares two computational models for Chinese SP induction, a HowNet-based Sele...

Probabilistic Distributional Semantics with Latent Variable Models

We describe a probabilistic framework for acquiring selectional preferences of linguistic predicates and for using the acquired representations to model the effects of context on word meaning. Our framework uses Bayesian latent-variable models inspired by, and extending, the well-known Latent Dirichlet Allocation (LDA) model of topical structure in documents; when applied to predicate–argument ...

The Impact of Selectional Preference Agreement on Semantic Relational Similarity

Relational similarity is essential to analogical reasoning. Automatically determining the degree to which a pair of words belongs to a semantic relation (relational similarity) is greatly improved by considering the selectional preferences of the relation. To determine selectional preferences, we induced semantic classes through a Latent Dirichlet Allocation (LDA) method that operates on depend...

Latent Semantic Clustering of German Verbs with Treebank Data

Treebank data have been utilized as data sources for a wide range of tasks in computational linguistics, including statistical parsing, anaphora resolution, induction of valence lexica, etc. More recently, researchers have experimented with extracting semantic information from syntactically annotated data. Here, treebank data have been used for the purposes of identifying selectional preference...

Publication date: 2010